Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 455495 |
| Missing cells | 143894 |
| Missing cells (%) | 1.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 69.5 MiB |
| Average record size in memory | 160.0 B |
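The two derived rows in this table can be cross-checked against the raw figures. The sketch below assumes that the missing-cell percentage is taken over all cells (observations × variables) and that the average record size is total memory divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics table above.
n_variables = 20
n_observations = 455_495
missing_cells = 143_894
total_size_mib = 69.5  # already rounded to one decimal place in the report

# Assumption: the percentage is computed over every cell (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures shown in the table. The missing-cell total also equals the sum of the per-variable missing counts reported for City_Code_Patient (6689), Stay (137057), and Bed Grade (148).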
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 11 |
Warnings
| Type is highly correlated with Stay | High correlation |
|---|---|
| Stay is highly correlated with Type | High correlation |
| City_Code_Patient has 6689 (1.5%) missing values | Missing |
| Stay has 137057 (30.1%) missing values | Missing |
| case_id is uniformly distributed | Uniform |
| case_id has unique values | Unique |
Reproduction
| Analysis started | 2021-04-05 16:15:24.580725 |
|---|---|
| Analysis finished | 2021-04-05 16:16:39.011706 |
| Duration | 1 minute and 14.43 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
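The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction table above.
started = datetime.fromisoformat("2021-04-05 16:15:24.580725")
finished = datetime.fromisoformat("2021-04-05 16:16:39.011706")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```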
df_index
Real number (ℝ≥0)
| Distinct | 318438 |
|---|---|
| Distinct (%) | 69.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 131930.0164 |
|---|---|
| Minimum | 0 |
| Maximum | 318437 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 11387 |
| Q1 | 56936.5 |
| median | 113873 |
| Q3 | 204563.5 |
| 95-th percentile | 295662.3 |
| Maximum | 318437 |
| Range | 318437 |
| Interquartile range (IQR) | 147627 |
Descriptive statistics
| Standard deviation | 90048.67961 |
|---|---|
| Coefficient of variation (CV) | 0.6825488399 |
| Kurtosis | -0.9591527657 |
| Mean | 131930.0164 |
| Median Absolute Deviation (MAD) | 68188 |
| Skewness | 0.454073103 |
| Sum | 6.00934628 × 10¹⁰ |
| Variance | 8108764699 |
| Monotonicity | Not monotonic |
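Several of the df_index statistics are derived from others in the same tables. The sketch below recomputes them from the rounded values printed in the report, so small differences in the last digits are expected; the variable names are illustrative.

```python
# Values copied from the df_index tables above (already rounded by the report).
mean = 131930.0164
std = 90048.67961
q1, q3 = 56936.5, 204563.5
minimum, maximum = 0, 318437

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.10f} variance={variance:.0f} IQR={iqr} range={value_range}")
```

The same relationships hold for every numeric variable in the report.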
| Value | Count | Frequency (%) |
| 0 | 2 | < 0.1% |
| 105659 | 2 | < 0.1% |
| 119986 | 2 | < 0.1% |
| 122035 | 2 | < 0.1% |
| 124084 | 2 | < 0.1% |
| 126133 | 2 | < 0.1% |
| 128182 | 2 | < 0.1% |
| 130231 | 2 | < 0.1% |
| 99512 | 2 | < 0.1% |
| 101561 | 2 | < 0.1% |
| Other values (318428) | 455475 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1 | 2 | |
| 2 | 2 | |
| 3 | 2 | |
| 4 | 2 |
| Value | Count | Frequency (%) |
| 318437 | 1 | |
| 318436 | 1 | |
| 318435 | 1 | |
| 318434 | 1 | |
| 318433 | 1 |
case_id
Real number (ℝ≥0)
| Distinct | 455495 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 227748 |
|---|---|
| Minimum | 1 |
| Maximum | 455495 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 22775.7 |
| Q1 | 113874.5 |
| median | 227748 |
| Q3 | 341621.5 |
| 95-th percentile | 432720.3 |
| Maximum | 455495 |
| Range | 455494 |
| Interquartile range (IQR) | 227747 |
Descriptive statistics
| Standard deviation | 131490.2248 |
|---|---|
| Coefficient of variation (CV) | 0.5773496354 |
| Kurtosis | -1.2 |
| Mean | 227748 |
| Median Absolute Deviation (MAD) | 113874 |
| Skewness | -1.170820996 × 10⁻¹⁵ |
| Sum | 1.037380753 × 10¹¹ |
| Variance | 1.728967921 × 10¹⁰ |
| Monotonicity | Strictly increasing |
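These figures are what one would expect for an identifier that takes each integer from 1 to n exactly once, which the minimum, maximum, distinct count, and strictly increasing flag suggest for the case_id column named in the warnings. As a sketch under that assumption, the closed-form moments of the sequence 1, 2, ..., n can be compared with the table; the formulas are standard results for a discrete uniform sequence, not something the report itself states.

```python
import math

n = 455_495  # number of observations; values assumed to be 1, 2, ..., n

mean = (n + 1) / 2

# Sample variance (divisor n - 1): (n^2 - 1) / 12 scaled by n / (n - 1).
sample_variance = n * (n + 1) / 12
sample_std = math.sqrt(sample_variance)

# Population excess kurtosis of a discrete uniform sequence; tends to -1.2.
excess_kurtosis = -6 * (n**2 + 1) / (5 * (n**2 - 1))

# The sequence is symmetric about its mean, so the exact skewness is zero.
skewness = 0.0

print(mean, sample_variance, sample_std, excess_kurtosis, skewness)
```

Because the exact skewness is zero, any tiny non-zero value in the report is floating-point rounding noise.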
| Value | Count | Frequency (%) |
| 2049 | 1 | < 0.1% |
| 238369 | 1 | < 0.1% |
| 248620 | 1 | < 0.1% |
| 258859 | 1 | < 0.1% |
| 260906 | 1 | < 0.1% |
| 254761 | 1 | < 0.1% |
| 256808 | 1 | < 0.1% |
| 234279 | 1 | < 0.1% |
| 236326 | 1 | < 0.1% |
| 230181 | 1 | < 0.1% |
| Other values (455485) | 455485 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 |
| Value | Count | Frequency (%) |
| 455495 | 1 | |
| 455494 | 1 | |
| 455493 | 1 | |
| 455492 | 1 | |
| 455491 | 1 |
Hospital_code
Real number (ℝ≥0)
| Distinct | 32 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.32633509 |
|---|---|
| Minimum | 1 |
| Maximum | 32 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 19 |
| Q3 | 26 |
| 95-th percentile | 30 |
| Maximum | 32 |
| Range | 31 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.63403567 |
|---|---|
| Coefficient of variation (CV) | 0.4711272401 |
| Kurtosis | -1.138349673 |
| Mean | 18.32633509 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | -0.2820898336 |
| Sum | 8347554 |
| Variance | 74.54657196 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 26 | 47523 | 10.4% |
| 23 | 38220 | 8.4% |
| 19 | 30036 | 6.6% |
| 6 | 29221 | 6.4% |
| 11 | 24827 | 5.5% |
| 14 | 24715 | 5.4% |
| 28 | 24572 | 5.4% |
| 27 | 20243 | 4.4% |
| 9 | 16360 | 3.6% |
| 12 | 16170 | 3.5% |
| Other values (22) | 183608 |
| Value | Count | Frequency (%) |
| 1 | 7460 | |
| 2 | 7277 | |
| 3 | 10277 | |
| 4 | 1749 | 0.4% |
| 5 | 7448 |
| Value | Count | Frequency (%) |
| 32 | 15252 | |
| 31 | 5740 | 1.3% |
| 30 | 7215 | 1.6% |
| 29 | 16158 | |
| 28 | 24572 |
Hospital_type_code
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| a | |
|---|---|
| b | |
| c | |
| e | |
| d | |
| Other values (2) |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 455495 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | c |
|---|---|
| 2nd row | c |
| 3rd row | e |
| 4th row | b |
| 5th row | b |
| Value | Count | Frequency (%) |
| a | 204730 | |
| b | 98884 | |
| c | 66147 | 14.5% |
| e | 35428 | 7.8% |
| d | 29048 | 6.4% |
| f | 15252 | 3.3% |
| g | 6006 | 1.3% |
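Some frequency cells in this table are blank, because the original report drew those percentages inside bars. They can be recovered from the counts. The sketch below assumes each percentage is the count divided by the total number of observations, rounded to one decimal place; the dictionary simply restates the counts from the table.

```python
# Counts copied from the Hospital_type_code frequency table above.
counts = {
    "a": 204730,
    "b": 98884,
    "c": 66147,
    "e": 35428,
    "d": 29048,
    "f": 15252,
    "g": 6006,
}

total = sum(counts.values())

# Assumption: frequency (%) = count / total observations, one decimal place.
percentages = {
    value: round(100 * count / total, 1) for value, count in counts.items()
}

for value, pct in percentages.items():
    print(f"{value}: {counts[value]} ({pct}%)")
```

The recomputed values for c through g agree with the percentages that survived in the table, which supports the assumption for the blank rows.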
| Value | Count | Frequency (%) |
| a | 204730 | |
| b | 98884 | |
| c | 66147 | 14.5% |
| e | 35428 | 7.8% |
| d | 29048 | 6.4% |
| f | 15252 | 3.3% |
| g | 6006 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 204730 | |
| b | 98884 | |
| c | 66147 | 14.5% |
| e | 35428 | 7.8% |
| d | 29048 | 6.4% |
| f | 15252 | 3.3% |
| g | 6006 | 1.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 455495 |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 204730 | |
| b | 98884 | |
| c | 66147 | 14.5% |
| e | 35428 | 7.8% |
| d | 29048 | 6.4% |
| f | 15252 | 3.3% |
| g | 6006 | 1.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 455495 |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 204730 | |
| b | 98884 | |
| c | 66147 | 14.5% |
| e | 35428 | 7.8% |
| d | 29048 | 6.4% |
| f | 15252 | 3.3% |
| g | 6006 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 455495 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 204730 | |
| b | 98884 | |
| c | 66147 | 14.5% |
| e | 35428 | 7.8% |
| d | 29048 | 6.4% |
| f | 15252 | 3.3% |
| g | 6006 | 1.3% |
City_Code_Hospital
Real number (ℝ≥0)
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.767797671 |
|---|---|
| Minimum | 1 |
| Maximum | 13 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 11 |
| Maximum | 13 |
| Range | 12 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.102450222 |
|---|---|
| Coefficient of variation (CV) | 0.6507092869 |
| Kurtosis | -0.6106581387 |
| Mean | 4.767797671 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.5434124607 |
| Sum | 2171708 |
| Variance | 9.625197382 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 79058 | |
| 2 | 74312 | |
| 6 | 67441 | |
| 7 | 50279 | |
| 3 | 45544 | |
| 5 | 44395 | |
| 9 | 37428 | |
| 11 | 24572 | 5.4% |
| 4 | 19778 | 4.3% |
| 10 | 7460 | 1.6% |
| Value | Count | Frequency (%) |
| 1 | 79058 | |
| 2 | 74312 | |
| 3 | 45544 | |
| 4 | 19778 | 4.3% |
| 5 | 44395 |
| Value | Count | Frequency (%) |
| 13 | 5228 | 1.1% |
| 11 | 24572 | |
| 10 | 7460 | 1.6% |
| 9 | 37428 | |
| 7 | 50279 |
Hospital_region_code
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| X | |
|---|---|
| Y | |
| Z |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 455495 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Z |
|---|---|
| 2nd row | Z |
| 3rd row | X |
| 4th row | Y |
| 5th row | Y |
| Value | Count | Frequency (%) |
| X | 190849 | |
| Y | 174707 | |
| Z | 89939 |
| Value | Count | Frequency (%) |
| x | 190849 | |
| y | 174707 | |
| z | 89939 |
Most occurring characters
| Value | Count | Frequency (%) |
| X | 190849 | |
| Y | 174707 | |
| Z | 89939 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 455495 |
Most frequent character per category
| Value | Count | Frequency (%) |
| X | 190849 | |
| Y | 174707 | |
| Z | 89939 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 455495 |
Most frequent character per script
| Value | Count | Frequency (%) |
| X | 190849 | |
| Y | 174707 | |
| Z | 89939 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 455495 |
Most frequent character per block
| Value | Count | Frequency (%) |
| X | 190849 | |
| Y | 174707 | |
| Z | 89939 |
Available Extra Rooms in Hospital
Real number (ℝ≥0)
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.196140463 |
|---|---|
| Minimum | 0 |
| Maximum | 24 |
| Zeros | 22 |
| Zeros (%) | < 0.1% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 2 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 24 |
| Range | 24 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.166993742 |
|---|---|
| Coefficient of variation (CV) | 0.3651259247 |
| Kurtosis | 2.549561887 |
| Mean | 3.196140463 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9589704305 |
| Sum | 1455826 |
| Variance | 1.361874393 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 140895 | |
| 4 | 131191 | |
| 3 | 130755 | |
| 5 | 27602 | 6.1% |
| 6 | 11003 | 2.4% |
| 1 | 7984 | 1.8% |
| 7 | 4107 | 0.9% |
| 8 | 1468 | 0.3% |
| 9 | 327 | 0.1% |
| 10 | 89 | < 0.1% |
| Other values (8) | 74 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 22 | < 0.1% |
| 1 | 7984 | 1.8% |
| 2 | 140895 | |
| 3 | 130755 | |
| 4 | 131191 |
| Value | Count | Frequency (%) |
| 24 | 1 | < 0.1% |
| 21 | 4 | |
| 20 | 2 | |
| 14 | 1 | < 0.1% |
| 13 | 3 |
Department
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| gynecology | |
|---|---|
| anesthesia | |
| radiotherapy | |
| TB & Chest disease | 13751 |
| surgery | 1665 |
Length
| Max length | 18 |
|---|---|
| Median length | 10 |
| Mean length | 10.41071581 |
| Min length | 7 |
Characters and Unicode
| Total characters | 4742029 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | radiotherapy |
|---|---|
| 2nd row | radiotherapy |
| 3rd row | anesthesia |
| 4th row | radiotherapy |
| 5th row | radiotherapy |
| Value | Count | Frequency (%) |
| gynecology | 356688 | |
| anesthesia | 42358 | 9.3% |
| radiotherapy | 41033 | 9.0% |
| TB & Chest disease | 13751 | 3.0% |
| surgery | 1665 | 0.4% |
| Value | Count | Frequency (%) |
| gynecology | 356688 | |
| anesthesia | 42358 | 8.5% |
| radiotherapy | 41033 | 8.3% |
| & | 13751 | 2.8% |
| disease | 13751 | 2.8% |
| tb | 13751 | 2.8% |
| chest | 13751 | 2.8% |
| surgery | 1665 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| y | 756074 | |
| o | 754409 | |
| g | 715041 | |
| e | 525355 | |
| n | 399046 | |
| c | 356688 | |
| l | 356688 | |
| a | 180533 | 3.8% |
| s | 127634 | 2.7% |
| i | 97142 | 2.0% |
| Other values (11) | 473419 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4645772 | |
| Uppercase Letter | 41253 | 0.9% |
| Space Separator | 41253 | 0.9% |
| Other Punctuation | 13751 | 0.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| y | 756074 | |
| o | 754409 | |
| g | 715041 | |
| e | 525355 | |
| n | 399046 | |
| c | 356688 | |
| l | 356688 | |
| a | 180533 | 3.9% |
| s | 127634 | 2.7% |
| i | 97142 | 2.1% |
| Other values (6) | 377162 |
| Value | Count | Frequency (%) |
| T | 13751 | |
| B | 13751 | |
| C | 13751 |
| Value | Count | Frequency (%) |
| (space) | 41253 |
| Value | Count | Frequency (%) |
| & | 13751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4687025 | |
| Common | 55004 | 1.2% |
Most frequent character per script
| Value | Count | Frequency (%) |
| y | 756074 | |
| o | 754409 | |
| g | 715041 | |
| e | 525355 | |
| n | 399046 | |
| c | 356688 | |
| l | 356688 | |
| a | 180533 | 3.9% |
| s | 127634 | 2.7% |
| i | 97142 | 2.1% |
| Other values (9) | 418415 |
| Value | Count | Frequency (%) |
| (space) | 41253 | |
| & | 13751 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4742029 |
Most frequent character per block
| Value | Count | Frequency (%) |
| y | 756074 | |
| o | 754409 | |
| g | 715041 | |
| e | 525355 | |
| n | 399046 | |
| c | 356688 | |
| l | 356688 | |
| a | 180533 | 3.8% |
| s | 127634 | 2.7% |
| i | 97142 | 2.0% |
| Other values (11) | 473419 |
Ward_Type
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| R | |
|---|---|
| Q | |
| S | |
| P | 7199 |
| T | 2133 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 455495 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | R |
|---|---|
| 2nd row | S |
| 3rd row | S |
| 4th row | R |
| 5th row | S |
| Value | Count | Frequency (%) |
| R | 182939 | |
| Q | 152046 | |
| S | 111166 | |
| P | 7199 | 1.6% |
| T | 2133 | 0.5% |
| U | 12 | < 0.1% |
| Value | Count | Frequency (%) |
| r | 182939 | |
| q | 152046 | |
| s | 111166 | |
| p | 7199 | 1.6% |
| t | 2133 | 0.5% |
| u | 12 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| R | 182939 | |
| Q | 152046 | |
| S | 111166 | |
| P | 7199 | 1.6% |
| T | 2133 | 0.5% |
| U | 12 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 455495 |
Most frequent character per category
| Value | Count | Frequency (%) |
| R | 182939 | |
| Q | 152046 | |
| S | 111166 | |
| P | 7199 | 1.6% |
| T | 2133 | 0.5% |
| U | 12 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 455495 |
Most frequent character per script
| Value | Count | Frequency (%) |
| R | 182939 | |
| Q | 152046 | |
| S | 111166 | |
| P | 7199 | 1.6% |
| T | 2133 | 0.5% |
| U | 12 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 455495 |
Most frequent character per block
| Value | Count | Frequency (%) |
| R | 182939 | |
| Q | 152046 | |
| S | 111166 | |
| P | 7199 | 1.6% |
| T | 2133 | 0.5% |
| U | 12 | < 0.1% |
Ward_Facility_Code
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| F | |
|---|---|
| E | |
| D | |
| C | |
| B |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 455495 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | F |
|---|---|
| 2nd row | F |
| 3rd row | E |
| 4th row | D |
| 5th row | D |
| Value | Count | Frequency (%) |
| F | 161470 | |
| E | 79058 | |
| D | 74312 | |
| C | 50279 | 11.0% |
| B | 50116 | 11.0% |
| A | 40260 | 8.8% |
| Value | Count | Frequency (%) |
| f | 161470 | |
| e | 79058 | |
| d | 74312 | |
| c | 50279 | 11.0% |
| b | 50116 | 11.0% |
| a | 40260 | 8.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 161470 | |
| E | 79058 | |
| D | 74312 | |
| C | 50279 | 11.0% |
| B | 50116 | 11.0% |
| A | 40260 | 8.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 455495 |
Most frequent character per category
| Value | Count | Frequency (%) |
| F | 161470 | |
| E | 79058 | |
| D | 74312 | |
| C | 50279 | 11.0% |
| B | 50116 | 11.0% |
| A | 40260 | 8.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 455495 |
Most frequent character per script
| Value | Count | Frequency (%) |
| F | 161470 | |
| E | 79058 | |
| D | 74312 | |
| C | 50279 | 11.0% |
| B | 50116 | 11.0% |
| A | 40260 | 8.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 455495 |
Most frequent character per block
| Value | Count | Frequency (%) |
| F | 161470 | |
| E | 79058 | |
| D | 74312 | |
| C | 50279 | 11.0% |
| B | 50116 | 11.0% |
| A | 40260 | 8.8% |
Bed Grade
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 148 |
| Missing (%) | < 0.1% |
| Memory size | 3.5 MiB |
| 2.0 | |
|---|---|
| 3.0 | |
| 4.0 | |
| 1.0 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1366041 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
| Value | Count | Frequency (%) |
| 2.0 | 176451 | |
| 3.0 | 158942 | |
| 4.0 | 82387 | |
| 1.0 | 37567 | 8.2% |
| (Missing) | 148 | < 0.1% |
| Value | Count | Frequency (%) |
| 2.0 | 176451 | |
| 3.0 | 158942 | |
| 4.0 | 82387 | |
| 1.0 | 37567 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 455347 | |
| 0 | 455347 | |
| 2 | 176451 | 12.9% |
| 3 | 158942 | 11.6% |
| 4 | 82387 | 6.0% |
| 1 | 37567 | 2.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 910694 | |
| Other Punctuation | 455347 |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 455347 | |
| 2 | 176451 | 19.4% |
| 3 | 158942 | 17.5% |
| 4 | 82387 | 9.0% |
| 1 | 37567 | 4.1% |
| Value | Count | Frequency (%) |
| . | 455347 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1366041 |
Most frequent character per script
| Value | Count | Frequency (%) |
| . | 455347 | |
| 0 | 455347 | |
| 2 | 176451 | 12.9% |
| 3 | 158942 | 11.6% |
| 4 | 82387 | 6.0% |
| 1 | 37567 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1366041 |
Most frequent character per block
| Value | Count | Frequency (%) |
| . | 455347 | |
| 0 | 455347 | |
| 2 | 176451 | 12.9% |
| 3 | 158942 | 11.6% |
| 4 | 82387 | 6.0% |
| 1 | 37567 | 2.8% |
patientid
Real number (ℝ≥0)
| Distinct | 131624 |
|---|---|
| Distinct (%) | 28.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 65786.79356 |
|---|---|
| Minimum | 1 |
| Maximum | 131624 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 6607 |
| Q1 | 32874 |
| median | 65735 |
| Q3 | 98576.5 |
| 95-th percentile | 125071.3 |
| Maximum | 131624 |
| Range | 131623 |
| Interquartile range (IQR) | 65702.5 |
Descriptive statistics
| Standard deviation | 37968.83085 |
|---|---|
| Coefficient of variation (CV) | 0.5771497408 |
| Kurtosis | -1.197556563 |
| Mean | 65786.79356 |
| Median Absolute Deviation (MAD) | 32852 |
| Skewness | 0.003135672615 |
| Sum | 2.996555553 × 10¹⁰ |
| Variance | 1441632116 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 66714 | 50 | < 0.1% |
| 91292 | 43 | < 0.1% |
| 38525 | 39 | < 0.1% |
| 114456 | 37 | < 0.1% |
| 101359 | 36 | < 0.1% |
| 33491 | 34 | < 0.1% |
| 32886 | 32 | < 0.1% |
| 6645 | 31 | < 0.1% |
| 99644 | 30 | < 0.1% |
| 31203 | 30 | < 0.1% |
| Other values (131614) | 455133 |
| Value | Count | Frequency (%) |
| 1 | 4 | |
| 2 | 2 | < 0.1% |
| 3 | 4 | |
| 4 | 2 | < 0.1% |
| 5 | 7 |
| Value | Count | Frequency (%) |
| 131624 | 3 | < 0.1% |
| 131623 | 2 | < 0.1% |
| 131622 | 4 | |
| 131621 | 3 | < 0.1% |
| 131620 | 9 |
City_Code_Patient
Real number (ℝ≥0)
| Distinct | 37 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 6689 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.249495328 |
|---|---|
| Minimum | 1 |
| Maximum | 38 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 8 |
| Q3 | 8 |
| 95-th percentile | 16 |
| Maximum | 38 |
| Range | 37 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 4.758940953 |
|---|---|
| Coefficient of variation (CV) | 0.6564513442 |
| Kurtosis | 4.516135526 |
| Mean | 7.249495328 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.601060026 |
| Sum | 3253617 |
| Variance | 22.64751899 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8 | 176825 | |
| 2 | 55681 | 12.2% |
| 1 | 37772 | 8.3% |
| 7 | 33958 | 7.5% |
| 5 | 28978 | 6.4% |
| 4 | 22044 | 4.8% |
| 9 | 16692 | 3.7% |
| 15 | 12804 | 2.8% |
| 10 | 11809 | 2.6% |
| 6 | 8723 | 1.9% |
| Other values (27) | 43520 | 9.6% |
| Value | Count | Frequency (%) |
| 1 | 37772 | |
| 2 | 55681 | |
| 3 | 5401 | 1.2% |
| 4 | 22044 | 4.8% |
| 5 | 28978 |
| Value | Count | Frequency (%) |
| 38 | 18 | < 0.1% |
| 37 | 78 | |
| 36 | 29 | < 0.1% |
| 35 | 30 | < 0.1% |
| 34 | 96 |
Type of Admission
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| Trauma | |
|---|---|
| Emergency | |
| Urgent |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 7.108879351 |
| Min length | 6 |
Characters and Unicode
| Total characters | 3238059 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Emergency |
|---|---|
| 2nd row | Trauma |
| 3rd row | Trauma |
| 4th row | Trauma |
| 5th row | Trauma |
| Value | Count | Frequency (%) |
| Trauma | 217672 | |
| Emergency | 168363 | |
| Urgent | 69460 | 15.2% |
| Value | Count | Frequency (%) |
| trauma | 217672 | |
| emergency | 168363 | |
| urgent | 69460 | 15.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 455495 | |
| a | 435344 | |
| e | 406186 | |
| m | 386035 | |
| g | 237823 | |
| n | 237823 | |
| T | 217672 | |
| u | 217672 | |
| E | 168363 | 5.2% |
| c | 168363 | 5.2% |
| Other values (3) | 307283 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2782564 | |
| Uppercase Letter | 455495 | 14.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| r | 455495 | |
| a | 435344 | |
| e | 406186 | |
| m | 386035 | |
| g | 237823 | |
| n | 237823 | |
| u | 217672 | |
| c | 168363 | 6.1% |
| y | 168363 | 6.1% |
| t | 69460 | 2.5% |
| Value | Count | Frequency (%) |
| T | 217672 | |
| E | 168363 | |
| U | 69460 | 15.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3238059 |
Most frequent character per script
| Value | Count | Frequency (%) |
| r | 455495 | |
| a | 435344 | |
| e | 406186 | |
| m | 386035 | |
| g | 237823 | |
| n | 237823 | |
| T | 217672 | |
| u | 217672 | |
| E | 168363 | 5.2% |
| c | 168363 | 5.2% |
| Other values (3) | 307283 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3238059 |
Most frequent character per block
| Value | Count | Frequency (%) |
| r | 455495 | |
| a | 435344 | |
| e | 406186 | |
| m | 386035 | |
| g | 237823 | |
| n | 237823 | |
| T | 217672 | |
| u | 217672 | |
| E | 168363 | 5.2% |
| c | 168363 | 5.2% |
| Other values (3) | 307283 |
Severity of Illness
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| Moderate | |
|---|---|
| Minor | |
| Extreme |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.013381047 |
| Min length | 5 |
Characters and Unicode
| Total characters | 3194560 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Extreme |
|---|---|
| 2nd row | Extreme |
| 3rd row | Extreme |
| 4th row | Extreme |
| 5th row | Extreme |
| Value | Count | Frequency (%) |
| Moderate | 251565 | |
| Minor | 122735 | |
| Extreme | 81195 | 17.8% |
| Value | Count | Frequency (%) |
| moderate | 251565 | |
| minor | 122735 | |
| extreme | 81195 | 17.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 665520 | |
| r | 455495 | |
| M | 374300 | |
| o | 374300 | |
| t | 332760 | |
| d | 251565 | 7.9% |
| a | 251565 | 7.9% |
| i | 122735 | 3.8% |
| n | 122735 | 3.8% |
| E | 81195 | 2.5% |
| Other values (2) | 162390 | 5.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2739065 | |
| Uppercase Letter | 455495 | 14.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 665520 | |
| r | 455495 | |
| o | 374300 | |
| t | 332760 | |
| d | 251565 | 9.2% |
| a | 251565 | 9.2% |
| i | 122735 | 4.5% |
| n | 122735 | 4.5% |
| x | 81195 | 3.0% |
| m | 81195 | 3.0% |
| Value | Count | Frequency (%) |
| M | 374300 | |
| E | 81195 | 17.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3194560 |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 665520 | |
| r | 455495 | |
| M | 374300 | |
| o | 374300 | |
| t | 332760 | |
| d | 251565 | 7.9% |
| a | 251565 | 7.9% |
| i | 122735 | 3.8% |
| n | 122735 | 3.8% |
| E | 81195 | 2.5% |
| Other values (2) | 162390 | 5.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3194560 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 665520 | |
| r | 455495 | |
| M | 374300 | |
| o | 374300 | |
| t | 332760 | |
| d | 251565 | 7.9% |
| a | 251565 | 7.9% |
| i | 122735 | 3.8% |
| n | 122735 | 3.8% |
| E | 81195 | 2.5% |
| Other values (2) | 162390 | 5.1% |
Visitors with Patient
Real number (ℝ≥0)
| Distinct | 29 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.284229245 |
|---|---|
| Minimum | 0 |
| Maximum | 32 |
| Zeros | 34 |
| Zeros (%) | < 0.1% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 2 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 32 |
| Range | 32 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.768044196 |
|---|---|
| Coefficient of variation (CV) | 0.538343722 |
| Kurtosis | 21.82119341 |
| Mean | 3.284229245 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.225088709 |
| Sum | 1495950 |
| Variance | 3.125980278 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 197734 | |
| 4 | 113497 | |
| 3 | 84689 | |
| 6 | 27011 | 5.9% |
| 5 | 13314 | 2.9% |
| 8 | 6920 | 1.5% |
| 7 | 3556 | 0.8% |
| 9 | 1918 | 0.4% |
| 1 | 1776 | 0.4% |
| 10 | 1632 | 0.4% |
| Other values (19) | 3448 | 0.8% |
| Value | Count | Frequency (%) |
| 0 | 34 | < 0.1% |
| 1 | 1776 | 0.4% |
| 2 | 197734 | |
| 3 | 84689 | |
| 4 | 113497 |
| Value | Count | Frequency (%) |
| 32 | 12 | < 0.1% |
| 30 | 26 | < 0.1% |
| 29 | 10 | < 0.1% |
| 25 | 12 | < 0.1% |
| 24 | 99 |
Age
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| 41-50 | |
|---|---|
| 31-40 | |
| 51-60 | |
| 21-30 | |
| 71-80 | |
| Other values (5) |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 4.984120572 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2270242 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 51-60 |
|---|---|
| 2nd row | 51-60 |
| 3rd row | 51-60 |
| 4th row | 51-60 |
| 5th row | 51-60 |
| Value | Count | Frequency (%) |
| 41-50 | 91495 | |
| 31-40 | 90420 | |
| 51-60 | 69506 | |
| 21-30 | 58560 | |
| 71-80 | 50737 | |
| 61-70 | 48619 | |
| 11-20 | 23871 | 5.2% |
| 81-90 | 11240 | 2.5% |
| 0-10 | 9140 | 2.0% |
| 91-100 | 1907 | 0.4% |
| Value | Count | Frequency (%) |
| 41-50 | 91495 | |
| 31-40 | 90420 | |
| 51-60 | 69506 | |
| 21-30 | 58560 | |
| 71-80 | 50737 | |
| 61-70 | 48619 | |
| 11-20 | 23871 | 5.2% |
| 81-90 | 11240 | 2.5% |
| 0-10 | 9140 | 2.0% |
| 91-100 | 1907 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 481273 | |
| 0 | 466542 | |
| - | 455495 | |
| 4 | 181915 | 8.0% |
| 5 | 161001 | 7.1% |
| 3 | 148980 | 6.6% |
| 6 | 118125 | 5.2% |
| 7 | 99356 | 4.4% |
| 2 | 82431 | 3.6% |
| 8 | 61977 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1814747 | |
| Dash Punctuation | 455495 | 20.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 1 | 481273 | |
| 0 | 466542 | |
| 4 | 181915 | 10.0% |
| 5 | 161001 | 8.9% |
| 3 | 148980 | 8.2% |
| 6 | 118125 | 6.5% |
| 7 | 99356 | 5.5% |
| 2 | 82431 | 4.5% |
| 8 | 61977 | 3.4% |
| 9 | 13147 | 0.7% |
| Value | Count | Frequency (%) |
| - | 455495 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2270242 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 1 | 481273 | |
| 0 | 466542 | |
| - | 455495 | |
| 4 | 181915 | 8.0% |
| 5 | 161001 | 7.1% |
| 3 | 148980 | 6.6% |
| 6 | 118125 | 5.2% |
| 7 | 99356 | 4.4% |
| 2 | 82431 | 3.6% |
| 8 | 61977 | 2.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2270242 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 1 | 481273 | |
| 0 | 466542 | |
| - | 455495 | |
| 4 | 181915 | 8.0% |
| 5 | 161001 | 7.1% |
| 3 | 148980 | 6.6% |
| 6 | 118125 | 5.2% |
| 7 | 99356 | 4.4% |
| 2 | 82431 | 3.6% |
| 8 | 61977 | 2.7% |
Admission_Deposit
Real number (ℝ≥0)
| Distinct | 7634 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4877.434022 |
|---|---|
| Minimum | 1800 |
| Maximum | 11920 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.5 MiB |
Quantile statistics
| Minimum | 1800 |
|---|---|
| 5-th percentile | 3360 |
| Q1 | 4184 |
| median | 4738 |
| Q3 | 5405 |
| 95-th percentile | 6918 |
| Maximum | 11920 |
| Range | 10120 |
| Interquartile range (IQR) | 1221 |
Descriptive statistics
| Standard deviation | 1084.982089 |
|---|---|
| Coefficient of variation (CV) | 0.2224493625 |
| Kurtosis | 1.854782006 |
| Mean | 4877.434022 |
| Median Absolute Deviation (MAD) | 603 |
| Skewness | 0.9313474371 |
| Sum | 2221646810 |
| Variance | 1177186.134 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4469 | 445 | 0.1% |
| 4277 | 427 | 0.1% |
| 4624 | 408 | 0.1% |
| 4789 | 378 | 0.1% |
| 4400 | 340 | 0.1% |
| 4807 | 333 | 0.1% |
| 4970 | 332 | 0.1% |
| 4465 | 328 | 0.1% |
| 4603 | 309 | 0.1% |
| 4579 | 306 | 0.1% |
| Other values (7624) | 451889 |
| Value | Count | Frequency (%) |
| 1800 | 2 | |
| 1801 | 2 | |
| 1802 | 2 | |
| 1805 | 2 | |
| 1806 | 1 |
| Value | Count | Frequency (%) |
| 11920 | 1 | < 0.1% |
| 11293 | 1 | < 0.1% |
| 11008 | 4 | |
| 10999 | 2 | |
| 10842 | 1 | < 0.1% |
Stay
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 137057 |
| Missing (%) | 30.1% |
| Memory size | 3.5 MiB |
| 21-30 | |
|---|---|
| 11-20 | |
| 31-40 | |
| 51-60 | |
| 0-10 | |
| Other values (6) |
Length
| Max length | 18 |
|---|---|
| Median length | 5 |
| Mean length | 5.207387309 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1658230 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0-10 |
|---|---|
| 2nd row | 41-50 |
| 3rd row | 31-40 |
| 4th row | 41-50 |
| 5th row | 41-50 |
| Value | Count | Frequency (%) |
| 21-30 | 87491 | |
| 11-20 | 78139 | |
| 31-40 | 55159 | |
| 51-60 | 35018 | 7.7% |
| 0-10 | 23604 | 5.2% |
| 41-50 | 11743 | 2.6% |
| 71-80 | 10254 | 2.3% |
| More than 100 Days | 6683 | 1.5% |
| 81-90 | 4838 | 1.1% |
| 91-100 | 2765 | 0.6% |
| (Missing) | 137057 |
| Value | Count | Frequency (%) |
| 21-30 | 87491 | |
| 11-20 | 78139 | |
| 31-40 | 55159 | |
| 51-60 | 35018 | |
| 0-10 | 23604 | 7.0% |
| 41-50 | 11743 | 3.5% |
| 71-80 | 10254 | 3.0% |
| days | 6683 | 2.0% |
| than | 6683 | 2.0% |
| 100 | 6683 | 2.0% |
| Other values (4) | 17030 | 5.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 399342 | |
| 0 | 351490 | |
| - | 311755 | |
| 2 | 165630 | |
| 3 | 142650 | 8.6% |
| 4 | 66902 | 4.0% |
| 5 | 46761 | 2.8% |
| 6 | 37762 | 2.3% |
| (space) | 20049 | 1.2% |
| 8 | 15092 | 0.9% |
| Other values (13) | 100797 | 6.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1246230 | |
| Dash Punctuation | 311755 | 18.8% |
| Lowercase Letter | 66830 | 4.0% |
| Space Separator | 20049 | 1.2% |
| Uppercase Letter | 13366 | 0.8% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 1 | 399342 | |
| 0 | 351490 | |
| 2 | 165630 | |
| 3 | 142650 | 11.4% |
| 4 | 66902 | 5.4% |
| 5 | 46761 | 3.8% |
| 6 | 37762 | 3.0% |
| 8 | 15092 | 1.2% |
| 7 | 12998 | 1.0% |
| 9 | 7603 | 0.6% |
| Value | Count | Frequency (%) |
| a | 13366 | |
| o | 6683 | |
| r | 6683 | |
| e | 6683 | |
| t | 6683 | |
| h | 6683 | |
| n | 6683 | |
| y | 6683 | |
| s | 6683 |
| Value | Count | Frequency (%) |
| M | 6683 | |
| D | 6683 |
| Value | Count | Frequency (%) |
| - | 311755 |
| Value | Count | Frequency (%) |
| (space) | 20049 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1578034 | |
| Latin | 80196 | 4.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 1 | 399342 | |
| 0 | 351490 | |
| - | 311755 | |
| 2 | 165630 | |
| 3 | 142650 | 9.0% |
| 4 | 66902 | 4.2% |
| 5 | 46761 | 3.0% |
| 6 | 37762 | 2.4% |
| (space) | 20049 | 1.3% |
| 8 | 15092 | 1.0% |
| Other values (2) | 20601 | 1.3% |
| Value | Count | Frequency (%) |
| a | 13366 | |
| M | 6683 | |
| o | 6683 | |
| r | 6683 | |
| e | 6683 | |
| t | 6683 | |
| h | 6683 | |
| n | 6683 | |
| D | 6683 | |
| y | 6683 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1658230 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 1 | 399342 | |
| 0 | 351490 | |
| - | 311755 | |
| 2 | 165630 | |
| 3 | 142650 | 8.6% |
| 4 | 66902 | 4.0% |
| 5 | 46761 | 2.8% |
| 6 | 37762 | 2.3% |
| (space) | 20049 | 1.2% |
| 8 | 15092 | 0.9% |
| Other values (13) | 100797 | 6.1% |
Type
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 MiB |
| Train | |
|---|---|
| Test |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 4.699103173 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2140418 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Train |
|---|---|
| 2nd row | Train |
| 3rd row | Train |
| 4th row | Train |
| 5th row | Train |
| Value | Count | Frequency (%) |
| Train | 318438 | |
| Test | 137057 |
| Value | Count | Frequency (%) |
| train | 318438 | |
| test | 137057 |
Most occurring characters
| Value | Count | Frequency (%) |
| T | 455495 | |
| r | 318438 | |
| a | 318438 | |
| i | 318438 | |
| n | 318438 | |
| e | 137057 | 6.4% |
| s | 137057 | 6.4% |
| t | 137057 | 6.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1684923 | |
| Uppercase Letter | 455495 | 21.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| r | 318438 | |
| a | 318438 | |
| i | 318438 | |
| n | 318438 | |
| e | 137057 | |
| s | 137057 | |
| t | 137057 |
| Value | Count | Frequency (%) |
| T | 455495 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2140418 |
Most frequent character per script
| Value | Count | Frequency (%) |
| T | 455495 | |
| r | 318438 | |
| a | 318438 | |
| i | 318438 | |
| n | 318438 | |
| e | 137057 | 6.4% |
| s | 137057 | 6.4% |
| t | 137057 | 6.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2140418 |
Most frequent character per block
| Value | Count | Frequency (%) |
| T | 455495 | |
| r | 318438 | |
| a | 318438 | |
| i | 318438 | |
| n | 318438 | |
| e | 137057 | 6.4% |
| s | 137057 | 6.4% |
| t | 137057 | 6.4% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
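As a minimal sketch of the calculation described above (covariance divided by the product of the standard deviations), the following pure-Python function computes r for two equal-length lists. The function name and the sample data are illustrative assumptions, not taken from the report.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of xs and ys over the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Hypothetical sample data with a nearly linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(xs, ys)

# Shifting and rescaling each variable separately leaves r unchanged.
r_shifted = pearson_r([10 * x - 3 for x in xs], [0.5 * y + 100 for y in ys])

# A perfectly decreasing linear relationship gives r = -1 (up to rounding).
r_negative = pearson_r(xs, [-2 * x + 1 for x in xs])
```

The second call illustrates the invariance under changes in location and scale mentioned in the text.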
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
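The description above amounts to applying Pearson's formula to ranks. The sketch below does that in pure Python, assigning tied values their average rank; the helper names and the sample data are illustrative assumptions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship (hypothetical data).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)   # monotonic, so rho is 1 up to rounding
r_linear = pearson_r(xs, ys) # noticeably below 1 because the relation is not linear
```

Comparing `rho` with `r_linear` shows why ρ is the better choice for nonlinear monotonic relationships.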
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
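A minimal sketch of the pair-counting formula described above follows. It implements the plain version (often called τ-a) with no adjustment for ties, whereas library implementations commonly use a tie-adjusted variant, so results may differ on data with many ties. The function name and sample data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six possible pairs.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

tau = kendall_tau_a(xs, ys)  # 5 concordant, 1 discordant, 6 pairs
```

With five concordant pairs and one discordant pair, τ here is (5 - 1) / 6.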
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | case_id | Hospital_code | Hospital_type_code | City_Code_Hospital | Hospital_region_code | Available Extra Rooms in Hospital | Department | Ward_Type | Ward_Facility_Code | Bed Grade | patientid | City_Code_Patient | Type of Admission | Severity of Illness | Visitors with Patient | Age | Admission_Deposit | Stay | Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 8 | c | 3 | Z | 3 | radiotherapy | R | F | 2.0 | 31397 | 7.0 | Emergency | Extreme | 2 | 51-60 | 4911.0 | 0-10 | Train |
| 1 | 1 | 2 | 2 | c | 5 | Z | 2 | radiotherapy | S | F | 2.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 5954.0 | 41-50 | Train |
| 2 | 2 | 3 | 10 | e | 1 | X | 2 | anesthesia | S | E | 2.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 4745.0 | 31-40 | Train |
| 3 | 3 | 4 | 26 | b | 2 | Y | 2 | radiotherapy | R | D | 2.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 7272.0 | 41-50 | Train |
| 4 | 4 | 5 | 26 | b | 2 | Y | 2 | radiotherapy | S | D | 2.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 5558.0 | 41-50 | Train |
| 5 | 5 | 6 | 23 | a | 6 | X | 2 | anesthesia | S | F | 2.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 4449.0 | 11-20 | Train |
| 6 | 6 | 7 | 32 | f | 9 | Y | 1 | radiotherapy | S | B | 3.0 | 31397 | 7.0 | Emergency | Extreme | 2 | 51-60 | 6167.0 | 0-10 | Train |
| 7 | 7 | 8 | 23 | a | 6 | X | 4 | radiotherapy | Q | F | 3.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 5571.0 | 41-50 | Train |
| 8 | 8 | 9 | 1 | d | 10 | Y | 2 | gynecology | R | B | 4.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 7223.0 | 51-60 | Train |
| 9 | 9 | 10 | 10 | e | 1 | X | 2 | gynecology | S | E | 3.0 | 31397 | 7.0 | Trauma | Extreme | 2 | 51-60 | 6056.0 | 31-40 | Train |
Last rows
| | df_index | case_id | Hospital_code | Hospital_type_code | City_Code_Hospital | Hospital_region_code | Available Extra Rooms in Hospital | Department | Ward_Type | Ward_Facility_Code | Bed Grade | patientid | City_Code_Patient | Type of Admission | Severity of Illness | Visitors with Patient | Age | Admission_Deposit | Stay | Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 455485 | 137047 | 455486 | 9 | d | 5 | Z | 2 | gynecology | S | F | 4.0 | 55235 | 4.0 | Emergency | Moderate | 3 | 31-40 | 4418.0 | NaN | Test |
| 455486 | 137048 | 455487 | 13 | a | 5 | Z | 2 | gynecology | R | F | 3.0 | 55235 | 4.0 | Emergency | Moderate | 2 | 31-40 | 3816.0 | NaN | Test |
| 455487 | 137049 | 455488 | 12 | a | 9 | Y | 6 | gynecology | Q | B | 2.0 | 99515 | 7.0 | Emergency | Moderate | 4 | 61-70 | 4406.0 | NaN | Test |
| 455488 | 137050 | 455489 | 13 | a | 5 | Z | 3 | gynecology | R | F | 3.0 | 22878 | 21.0 | Emergency | Moderate | 3 | 21-30 | 4573.0 | NaN | Test |
| 455489 | 137051 | 455490 | 15 | c | 5 | Z | 2 | gynecology | S | F | 4.0 | 118215 | 6.0 | Urgent | Minor | 2 | 21-30 | 5241.0 | NaN | Test |
| 455490 | 137052 | 455491 | 11 | b | 2 | Y | 4 | anesthesia | Q | D | 3.0 | 41160 | 3.0 | Emergency | Minor | 4 | 41-50 | 6313.0 | NaN | Test |
| 455491 | 137053 | 455492 | 25 | e | 1 | X | 2 | radiotherapy | R | E | 4.0 | 30985 | 7.0 | Emergency | Moderate | 2 | 0-10 | 3510.0 | NaN | Test |
| 455492 | 137054 | 455493 | 30 | c | 3 | Z | 2 | anesthesia | R | A | 4.0 | 81811 | 12.0 | Urgent | Minor | 2 | 0-10 | 7190.0 | NaN | Test |
| 455493 | 137055 | 455494 | 5 | a | 1 | X | 2 | anesthesia | R | E | 4.0 | 57021 | 10.0 | Trauma | Minor | 2 | 41-50 | 5435.0 | NaN | Test |
| 455494 | 137056 | 455495 | 6 | a | 6 | X | 3 | gynecology | Q | F | 4.0 | 126729 | 3.0 | Trauma | Extreme | 5 | 51-60 | 4702.0 | NaN | Test |